Chromatin Immunoprecipitation Sequencing ◾ 219
ENCFF000XKD_chp3.fastq.gz \
ENCFF000XGP_inp0.fastq.gz
Then, we can display the reports in an Internet browser using the “firefox” command as follows:
firefox \
ENCFF000XJP_chp1_fastqc.html \
ENCFF000XJS_chp2_fastqc.html \
ENCFF000XKD_chp3_fastqc.html \
ENCFF000XGP_inp0_fastqc.html
cd ..
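The report names above follow the pattern of the FASTQ file name with the “.fastq.gz” suffix replaced by “_fastqc.html”. As a minimal sketch, assuming that pattern holds for all four files, the hypothetical helper below (report_name is not part of FastQC) derives the report name from a FASTQ file name, so that the list of reports does not have to be typed by hand.

```shell
# Hypothetical helper: map a FASTQ file name to the HTML report name
# that matches the pattern used above (<name>_fastqc.html).
report_name() {
    base="${1##*/}"          # drop any leading directory
    base="${base%.gz}"       # drop the compression suffix, if present
    base="${base%.fastq}"    # drop the FASTQ suffix
    echo "${base}_fastqc.html"
}
```

For example, report_name ENCFF000XJP_chp1.fastq.gz prints ENCFF000XJP_chp1_fastqc.html, which can then be passed to the browser command or tested for existence before it is opened.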
To avoid repeating what was discussed in Chapter 1, we will assume that the four
FASTQ files are clean and ready for the next step.
6.3.3 ChIP-Seq and Input Read Mapping
The second step in ChIP-Seq data analysis, after data acquisition and quality control,
is aligning both the ChIP-Seq reads and the input reads to a reference genome, following
the same steps discussed in Chapter 2. You can use an aligner of your choice; in this
example, we will use Bowtie2. First, we need to download the FASTA file of the human
reference genome (the hg19 build in this example) from a reliable database such as NCBI
or UCSC. We prefer the UCSC version of the reference genome because its chromosomes are
given names (such as chr1) instead of accession numbers. After downloading the compressed
FASTA file, we decompress it with “gunzip” and index it with the “samtools faidx” command.
We then use Bowtie2 to build an index for the reference FASTA file. The following commands
create the directory “ref”, where the human reference genome is downloaded, decompressed,
and indexed with both Samtools and Bowtie2. Refer to Chapter 2 for Samtools and Bowtie2
installation and usage.
mkdir ref; cd ref
wget https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
gunzip -d hg19.fa.gz
samtools faidx hg19.fa
bowtie2-build hg19.fa hg19
cd ..
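Building the Bowtie2 index for a whole human genome takes a while, so it can be worth confirming that both indexing steps finished before starting the alignment. The following is a minimal sketch, not part of the original workflow: check_ref_index is a hypothetical helper that looks for the “.fai” file written by “samtools faidx” and for the six index files written by “bowtie2-build” (with the “.bt2” extension, or “.bt2l” when a large index is built).

```shell
# Hypothetical helper (not from the chapter): report whether the expected
# index files exist for a reference FASTA and a Bowtie2 index prefix.
check_ref_index() {
    fasta="$1"      # path to the reference FASTA, e.g. ref/hg19.fa
    prefix="$2"     # Bowtie2 index prefix, e.g. ref/hg19
    missing=0
    # samtools faidx writes <fasta>.fai next to the FASTA file
    if [ ! -s "${fasta}.fai" ]; then
        echo "missing: ${fasta}.fai"
        missing=1
    fi
    # bowtie2-build writes six files: .bt2 (small index) or .bt2l (large index)
    for part in 1 2 3 4 rev.1 rev.2; do
        if [ ! -s "${prefix}.${part}.bt2" ] && [ ! -s "${prefix}.${part}.bt2l" ]; then
            echo "missing: ${prefix}.${part}.bt2"
            missing=1
        fi
    done
    # exit status 0 when everything is present, 1 otherwise
    return "$missing"
}
```

Run from the directory that contains “ref”, a call such as check_ref_index ref/hg19.fa ref/hg19 prints one line per missing or empty file and returns a non-zero status if anything is absent. It only checks that the files exist and are not empty; it does not validate their contents.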
Once the above operations have been performed successfully, we can use Bowtie2 to align
both ChIP-Seq reads and control reads to the reference genome; since each file is aligned
separately, four SAM files will be produced.
mkdir bam
bowtie2 \
-p 4 \
-x ref/hg19 \